Abstract

In this paper, we consider the problem of interpreting the coefficients of a linear regression equation in the case of correlated predictors. It is shown that in general there is no natural way of interpreting these coefficients similar to the case of a single predictor. Nevertheless, we suggest linear transformations of the predictors that reduce multiple regression to a simple one while retaining the coefficient at the variable of interest. The new variable can be treated as the part of the old variable that has no linear statistical dependence on the other variables present.

Keywords: Simple and Multiple Regression; Correlated Predictors; Interpretation of Regression Coefficients.

1 Introduction 

Regression analysis is one of the main methods for studying dependency factors in diverse fields of inquiry where the use of statistical methods is expedient (see e.g. Draper and Smith (1998)). The efficiency of its application depends on the model and the set of explanatory variables (predictors) chosen. The most popular regression model is described by a linear equation expressing the dependence of the mean value of the variable to be explained (response, outcome) on the set of predictors.

The natural applicability domain of regression analysis is the case of continuous outcome and predictors. In this area, the classical regression analysis theory provides a thorough description of the dependence of the outcome on the explanatory variables considered. Of most interest in the linear model are the coefficients at the predictors. For example, in the simplest case of a single predictor $X_1$ and dependent variable $Y$ the linear regression equation is given by

$$y = b_0 + b_1 x_1$$

Factor $b_1$ is proportional to the coefficient of correlation between response $Y$ and predictor $X_1$. Furthermore, $b_1$ represents the increase (or decrease, if $b_1$ is negative) in the mean of $Y$ associated with a 1-unit increase in the value of $X_1$, i.e. $X_1 = x + 1$ versus $X_1 = x$. The sign of $b_1$ indicates the trend in the relationship between $Y$ and $X_1$.

Such thorough information about the relationship between the outcome and a single explanatory variable makes one wish to treat the coefficients of a multiple regression equation in a similar manner. It is well known, however, that in linear multiple regression models such an interpretation of regression coefficients is not correct if there are correlations among predictors (Draper and Smith (1998), Nalimov (1975), Ehrenberg (1975)). Moreover, in some practical cases such an interpretation is in conflict with common sense (Varaksin et al. (2004), see Section 4 below). The unique case where interpretation of multiple regression equation coefficients is meaningful is pairwise statistical independence of predictors. Then the multiple regression coefficients coincide with the corresponding simple regression coefficients for the outcome on a particular predictor (Draper and Smith (1998)).

Thus, the presence of correlated predictors renders the identification of the biomedical meaning of multiple regression equation coefficients a difficult task. Association among predictors or among predictors and outcome leads to unpredictable changes in regression coefficients and results in a loss of meaning in each particular coefficient.

Nevertheless, we cannot confine ourselves to independent (uncorrelated) variables only, as in most applications of regression analysis there are important problems with correlated predictors, e.g. various air pollution rates (see Section 4 below). Another important example is epidemiological studies (research into disease prevalence and its association with risk factors). Such factors as sex and age are invariably present in epidemiological data, being related to both the other independent variables and the outcome. These inherent variables, which confuse the effect of the other predictors on the response, are called confounders. Taking confounders into account in data analysis presents a difficult problem that does not have any correct solution as yet.

In a range of biomedical applications of regression analysis, of major interest is some variable $X_1$, which is considered along with accompanying variables $X_2, X_3, \ldots, X_k$ (confounders). Upon finding a multiple regression equation that depends on all of these predictors, one has to interpret coefficient $b_1$ standing at the principal predictor, with all other predictors adjusting the action of the main variable $X_1$. We shall consider below a way to interpret $b_1$ in terms of a simple regression of the outcome on a new variable, $X_1^*$. For simplicity, we shall discuss the cases of two and three predictors. The general case may be considered in a similar way.



2 Regression equation with two predictors 

Let us consider continuous variables $Y, X_1, X_2$ and the corresponding linear regression equation for outcome $Y$ on predictors $X_1, X_2$

$$y = b_0 + b_1 x_1 + b_2 x_2 \tag{1}$$

As usual, we suppose that coefficients $b_0, b_1, b_2$ and the other regression coefficients below have been obtained by the least squares method. We assume that the (linear) dependence of response $Y$ on predictor $X_1$ is significant, so $b_1 \neq 0$. Finally, let the linear regression equation with response $X_1$ and predictor $X_2$ be given by

$$x_1 = c_{120} + c_{12} x_2$$

We define a new variable, $X_1^*$, in which the linear dependence of $X_1$ on $X_2$ 'is excluded', as follows

$$X_1^* = X_1 - c_{12} X_2$$

Let us build a simple regression equation describing the mean of outcome $Y$ as a function of the new predictor $X_1^*$

$$y = a_{10}^* + a_1^* x_1^* \tag{2}$$

We have pairs of corresponding variables: $X_1$ and $x_1$, $X_1^*$ and $x_1^*$. Obviously, these variables cannot be interchanged; in particular, variables $X_1$, $X_1^*$ cannot be substituted in equations (1) and (2) instead of $x_1$ and $x_1^*$, respectively. If it were possible, one might transcribe equations (1) and (2) as

$$y = b_0 + b_1 \left( x_1 + \frac{b_2}{b_1} x_2 \right)$$

$$y = a_{10}^* + a_1^* \left( x_1 - c_{12} x_2 \right)$$
Although these equations are different, they have the same slope, as follows from the following theorem.

Theorem 1. In equation (1), coefficient $b_1$ is equal to coefficient $a_1^*$ in equation (2), i.e.

$$b_1 = a_1^*, \tag{3}$$

and it is possible that $\frac{b_2}{b_1} \neq -c_{12}$.

A similar statement holds for coefficients $b_2$ and $a_2^*$, where $a_2^*$ is the coefficient at variable $x_2^*$ in the simple regression equation $y = a_{20}^* + a_2^* x_2^*$, and the new variable $X_2^*$ is defined from the regression equation $x_2 = c_{210} + c_{21} x_1$ as $X_2^* = X_2 - c_{21} X_1$.

A formal proof of Theorem 1 is provided in Appendix 1.

Now coefficient $b_1$ of multiple regression equation (1) may be treated as follows. Recall that $b_1$ cannot be interpreted per se. But it is equal to coefficient $a_1^*$ of simple regression model (2). Hence we transform the problem of interpreting $b_1$ into one of interpreting the new variable $X_1^*$. It is easy to check that $X_1^*$ and $X_2$ are uncorrelated. So one can say that variable $X_1^*$ is obtained from variable $X_1$ by excluding the part of it that is linearly dependent on $X_2$. This does not mean that by constructing variable $X_1^*$ we can split the contributions of $X_1$ and $X_2$ to response $Y$. In fact, there is no way to do this given correlated predictors.
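
Theorem 1 is easy to check numerically. The sketch below is only an illustration and is not part of the original analysis: the synthetic data, the seed, the sample size and the helper name `ols` are all assumptions made for the example. It fits equation (1), builds $X_1^*$ from the regression of $X_1$ on $X_2$, fits equation (2), and compares the two slopes.

```python
import numpy as np

# Synthetic data (an assumption for illustration): x1 is correlated with x2.
rng = np.random.default_rng(0)
n = 500
x2 = rng.normal(size=n)
x1 = 0.6 * x2 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(size=n)


def ols(target, *cols):
    """Hypothetical helper: least-squares coefficients, intercept first."""
    X = np.column_stack([np.ones(len(target)), *cols])
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return coef


b0, b1, b2 = ols(y, x1, x2)        # multiple regression, equation (1)
c120, c12 = ols(x1, x2)            # regression of x1 on x2
x1_star = x1 - c12 * x2            # new variable X1*
a10_star, a1_star = ols(y, x1_star)  # simple regression, equation (2)

print("b1  =", b1)
print("a1* =", a1_star)
print("corr(X1*, X2) =", np.corrcoef(x1_star, x2)[0, 1])
```

Under these assumptions the two printed slopes agree up to rounding error, and the sample correlation between $X_1^*$ and $X_2$ is zero up to rounding error.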

Now consider a more general way to define variable $X_1^*$, namely, let $X_1^* = X_1 - \gamma X_2$, where $\gamma$ is a real number, and pose the question: how many values may $\gamma$ take for equality (3) to hold? In the case under consideration, we can express the dependence of $a_1^*$ on parameter $\gamma$ in explicit form as follows

$$a_1^*(\gamma) = \frac{\overline{X_1 Y} - \bar{X}_1 \bar{Y} - \gamma \left( \overline{X_2 Y} - \bar{X}_2 \bar{Y} \right)}{\mathrm{var}(X_1) - 2\gamma\, \mathrm{cov}(X_1, X_2) + \gamma^2\, \mathrm{var}(X_2)} \tag{4}$$

where the bar over a symbol denotes the mean of the variable, and var and cov stand for variance and covariance, respectively.

Theorem 2. Equation $a_1^*(\gamma) = b_1$ has two solutions, namely

$$\gamma_1 = c_{12}, \qquad \gamma_2 = -\frac{b_2}{b_1}$$

For the proof of this theorem we refer the reader to Appendix 2.
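
As an informal check of Theorem 2, formula (4) can be coded directly and evaluated at the two values of $\gamma$. This is a minimal sketch on synthetic data; the data-generating coefficients, the seed and the function name `a1_star` are assumptions introduced only for this example.

```python
import numpy as np

# Synthetic data (an assumption for illustration).
rng = np.random.default_rng(42)
n = 300
x2 = rng.normal(size=n)
x1 = 0.8 * x2 + rng.normal(size=n)
y = 0.5 + 1.2 * x1 + 0.7 * x2 + rng.normal(size=n)

cov12 = np.mean(x1 * x2) - x1.mean() * x2.mean()


def a1_star(gamma):
    """Formula (4): slope of the simple regression of y on x1 - gamma * x2."""
    num = (np.mean(x1 * y) - x1.mean() * y.mean()
           - gamma * (np.mean(x2 * y) - x2.mean() * y.mean()))
    den = np.var(x1) - 2 * gamma * cov12 + gamma ** 2 * np.var(x2)
    return num / den


# Multiple regression (1) and the slope c12 of x1 on x2.
X = np.column_stack([np.ones(n), x1, x2])
b0, b1, b2 = np.linalg.lstsq(X, y, rcond=None)[0]
c12 = cov12 / np.var(x2)

gamma_1, gamma_2 = c12, -b2 / b1
print(b1, a1_star(gamma_1), a1_star(gamma_2))
```

For other values of $\gamma$, for instance the midpoint between the two roots, `a1_star` differs from $b_1$, in line with the theorem's claim that there are exactly two solutions.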

Given the explicit expression for $a_1^*(\gamma)$ in formula (4), we can plot it (see Fig. 1, where artificial data are used with $b_1 = 0.2918$, which is drawn as a horizontal line). There are some general properties of $a_1^*(\gamma)$: it is defined throughout the real axis, has two extrema, and the horizontal axis is an asymptote to it.

Figure 1: Plot of regression coefficient $a_1^*$ as a function of $\gamma$.

3 Regression equation with three predictors 

Now consider the case of one outcome $Y$ and three predictors $X_1, X_2, X_3$. The point of interest is predictor $X_1$, the other predictors being confounders. We want to get an interpretation of coefficient $b_1$ at the variable of interest in the multiple regression equation

$$y = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3 \tag{5}$$

We can introduce the regression equation of $X_1$ on covariates $X_2, X_3$:

$$x_1 = c_{0123} + c_{12} x_2 + c_{13} x_3,$$

and define a new variable by the formula

$$X_1^* = X_1 - c_{12} X_2 - c_{13} X_3 \tag{6}$$

As in Section 2, we can find a simple regression equation for $Y$ on covariate $X_1^*$

$$y = a_{01}^* + a_1^* x_1^* \tag{7}$$

Similar to Theorem 1, we have the following statement.

Theorem 3. Coefficient $b_1$ of equation (5) is equal to coefficient $a_1^*$ of equation (7), that is

$$b_1 = a_1^*$$

The proof of this theorem is given in Appendix 3.
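
The same kind of numerical check applies to three predictors. The following sketch uses synthetic data; the coefficients, the seed and the helper `ols` are assumptions for illustration only. It builds $X_1^*$ by formula (6) and compares $b_1$ from (5) with $a_1^*$ from (7).

```python
import numpy as np

# Synthetic, mutually correlated predictors (an assumption for illustration).
rng = np.random.default_rng(7)
n = 400
x2 = rng.normal(size=n)
x3 = 0.5 * x2 + rng.normal(size=n)
x1 = 0.4 * x2 - 0.3 * x3 + rng.normal(size=n)
y = 2.0 - 1.0 * x1 + 0.6 * x2 + 0.9 * x3 + rng.normal(size=n)


def ols(target, *cols):
    """Hypothetical helper: least-squares coefficients, intercept first."""
    X = np.column_stack([np.ones(len(target)), *cols])
    return np.linalg.lstsq(X, target, rcond=None)[0]


b = ols(y, x1, x2, x3)                 # equation (5): b[1] is b1
c = ols(x1, x2, x3)                    # c[1] is c12, c[2] is c13
x1_star = x1 - c[1] * x2 - c[2] * x3   # formula (6)
a = ols(y, x1_star)                    # equation (7): a[1] is a1*

print("b1  =", b[1])
print("a1* =", a[1])
```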

Figure 2: Surface $z = a_1^*(\gamma_2, \gamma_3)$ and plane $z = b_1$.



Going over to a more general case, we can define covariate $X_1^*$ as follows

$$X_1^* = X_1 - \gamma_2 X_2 - \gamma_3 X_3,$$

where $\gamma_2, \gamma_3$ are some real numbers. Then regression coefficient $a_1^*$ becomes a function of two real variables $\gamma_2, \gamma_3$. The shape of surface $z = a_1^*(\gamma_2, \gamma_3)$ is shown in Figure 2 (using simulated data with $b_1 = -2.031$).

As one can see from Fig. 1 and Fig. 2, the character of the dependence of $a_1^*$ on the corresponding parameter(s) is similar in both cases. The same is true of the general case.



4 Applications to real data analysis 
4.1 Regression with two predictors 

Let us consider the use of Theorem 1 for investigating the dependency of incidence on various air pollution toxicants in the city of St. Petersburg (Russia). The primary data were published in Scherbo (2002). In the remainder of this section, we take incidence to be the incidence rate in the adult population (i.e. the number of disease cases per 1000 adult population a year) averaged over a 5-year observation period. In the primary data, the rates of incidence were gathered across 19 boroughs of St. Petersburg. We consider toxicant concentrations as random variables, i.e. the mean toxicant concentration expressed in maximum concentration limit (MCL) terms and averaged over the 5-year observation period. Each of these variables takes on 19 values in accordance with the number of boroughs. We denote these covariates by the usual chemical notations: CO, NO2, SO2, Pb etc. (the data consist of 12 pollutants).

The simple linear regression equations of response $Y$ (incidence) on the concentrations of CO and NO2 are given by

$$Y = 603 + 579\, \mathrm{CO} \tag{8}$$

$$Y = 414 + 416\, \mathrm{NO_2} \tag{9}$$

According to equation (8), incidence increases by 579 cases per 1000 population a year when the CO concentration increases by one MCL unit. Equation (9) may be interpreted in the same way. In short, both CO and NO2 increase incidence.

There is a tight positive correlation between predictors CO and NO2: Pearson's correlation coefficient is 0.75, and the regression equation is

$$\mathrm{CO} = -0.131 + 0.576\, \mathrm{NO_2}$$

This shows that growth in one toxicant is related to growth in the other. Hence, one can conjecture that equation (8) in fact describes an increase in incidence at a simultaneous increase in both pollutants (CO and NO2). A question then arises: could one specify the 'pure' influence of each toxicant on incidence, separating the contribution of one toxicant from that of the other?

To extract the contribution of each toxicant to the incidence in the presence of other toxicants, researchers often use a multiple regression equation including all toxicants. Such an interpretation is common in some biological and medical applications of regression analysis. We refer to McNamee (2005) as a typical exposition. In the case under consideration, we obtain the multiple regression equation

$$Y = 465 + 390\, \mathrm{CO} + 191\, \mathrm{NO_2} \tag{10}$$



Many authors consider the coefficients of a multiple regression equation obtained by means of the least squares method to be meaningless if there are correlations among predictors (Draper and Smith (1998), Aivazian et al. (1985), Ehrenberg (1975)). These coefficients cannot be used to assess separately the dependence of $Y$ on CO and of $Y$ on NO2. Nevertheless, there are other authors who treat each coefficient of a multiple regression equation as the contribution of an individual toxicant to the outcome against the background of the other toxicants (e.g. McNamee (2005)). Moreover, this contribution is regarded as refined in comparison with (8)-(9). Their supposition is that the predictors, as it were, distribute their influence on the outcome in a multiple regression equation so that each predictor describes its influence with the others being in the background. According to this viewpoint, the addition of another toxicant, NO2, to CO and the change from (8) to (10) should attenuate the effect of CO, because the corresponding coefficient diminished from 579 to 390. The same conclusion holds for NO2 and CO and equations (9) and (10).

These authors do not provide any substantive explanation for the biomedical meaning of variations in the coefficients in (8)-(9) and (10); nor do they explain the refined contribution of each individual toxicant. Variations in regression coefficients can be explained by going over from simple regressions (8) or (9) to multiple regression (10). Indeed, coefficient $b_1 = 390$ in equation (10) is equal to coefficient $a_1^*$ in the simple regression equation

$$Y = a_{10}^* + a_1^*\, \mathrm{CO}^*,$$

where covariate CO* is defined by

$$\mathrm{CO}^* = \mathrm{CO} - 0.576\, \mathrm{NO_2} \tag{11}$$

By (11), predictor CO* is obtained from CO by excluding its part correlated with NO2. Then $b_1 = 390$ means an increase in the incidence rate with growth in CO concentration when the linear statistical dependence of CO on NO2 is excluded.

One can similarly treat coefficient $b_2 = 191$ in (10). It is equal to $a_2^*$ in the simple regression equation

$$Y = a_{20}^* + a_2^*\, \mathrm{NO_2^*},$$

where NO2* is the part of toxicant NO2 which contains no linear statistical dependence on CO.

We seem to have obtained a consistent picture: by excluding the (linear) dependence of one toxicant on the other we arrive at a 'pure' influence of a particular factor on incidence. Since both factors increase the incidence, and the concentration of each factor increases with growth in the other, one can anticipate that the magnitudes of the coefficients in equation (10) should be less than in (8)-(9). This is exactly so in the case under consideration.

It is not as simple as that, though. Let us consider the dependence of incidence $Y$ on the concentrations of CO and SO2. The simple regression equation of $Y$ on SO2 is given by

$$Y = 919 + 52\, \mathrm{SO_2}$$

The association between CO and SO2 is very similar to that between CO and NO2. For instance, the correlation coefficient is 0.73 and the regression equation is

$$\mathrm{CO} = 0.272 + 0.316\, \mathrm{SO_2} \tag{12}$$

The multiple regression equation in the case considered is (Varaksin et al. (2004))

$$Y = 634 + 1047\, \mathrm{CO} - 278\, \mathrm{SO_2} \tag{13}$$

Assuming the coefficients of (13) to be refined ones, we should treat the magnitude 1047 as the 'pure' influence of CO against the background of SO2, and $-278$ as the 'pure' influence of SO2 against the background of CO. Obviously, such an interpretation of regression coefficients is invalid, since the 'pure' influence of toxicant SO2 becomes negative. The reason for this misinterpretation is the tight correlation between predictors CO and SO2. One has to take this correlation into account in treating regression coefficients.

The coefficient at CO in (13) is almost twice as large as that in (8). By Theorem 1, coefficient $b_1 = 1047$ is equal to the slope in

$$Y = 697 + 1047\, \mathrm{CO}^*, \tag{14}$$

where $\mathrm{CO}^* = \mathrm{CO} - 0.316\, \mathrm{SO_2}$. In biomedical terms, we obtain an inexplicable picture: we have reduced the toxic burden on the population by removing one of the two toxicants, but the incidence grows with CO even more rapidly. In mathematical terms, we can explain this as follows. It is clear from the definition of CO* that its range is less than the range of CO. In both cases, the incidence is the same, which implies an increase in coefficient $b_1$. Generally, inequality $b_1 > a_1$ is impossible if we consider the multiple regression coefficients as refined ones. But if we refer to equality (23), we can see that for $a_2 \ll a_1$ and correlation coefficient $r$ close to 1, inequality $b_1 > a_1$ may hold true. Formula (24) also explains the possibility of a negative value for coefficient $b_2$.
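
Formulas (23) and (24) of Appendix 2 can be applied directly to the rounded coefficients quoted in this section. The sketch below is a worked arithmetic illustration, not a re-analysis of the primary data: the function name `multiple_from_simple` is hypothetical, and $c_{21}$ is recovered from the identity $c_{12} c_{21} = r^2$ used in (23). Because the inputs are rounded published values, the results agree with (10) and (13) only approximately.

```python
def multiple_from_simple(a1, a2, c12, r):
    """Formulas (23)-(24): multiple regression slopes b1, b2 from the simple
    slopes a1, a2, the slope c12 of X1 on X2 and the correlation r of X1, X2."""
    c21 = r ** 2 / c12          # since c12 * c21 = r^2
    denom = 1.0 - r ** 2
    b1 = (a1 - a2 * c21) / denom
    b2 = (a2 - a1 * c12) / denom
    return b1, b2


# CO and NO2: simple slopes 579 and 416, CO = ... + 0.576 NO2, r = 0.75.
b1_no2, b2_no2 = multiple_from_simple(579, 416, 0.576, 0.75)

# CO and SO2: simple slopes 579 and 52, CO = ... + 0.316 SO2, r = 0.73.
b1_so2, b2_so2 = multiple_from_simple(579, 52, 0.316, 0.73)

print(round(b1_no2), round(b2_no2))   # compare with 390 and 191 in (10)
print(round(b1_so2), round(b2_so2))   # compare with 1047 and -278 in (13)
```

In the second case the small simple slope $a_2 = 52$ together with the strong correlation makes $a_2 - a_1 c_{12}$ negative, which is exactly the mechanism behind the negative coefficient at SO2.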

4.2 Regression with three predictors

Let us consider a regression equation of incidence $Y$ on three predictors CO, NO2 and SO2. By the least squares method, we obtain the equation

$$Y = 494 + 857\, \mathrm{CO} + 194\, \mathrm{NO_2} - 279\, \mathrm{SO_2}$$

The regression equation of CO on the other two predictors becomes

$$\mathrm{CO} = -0.108 + 0.386\, \mathrm{NO_2} + 0.195\, \mathrm{SO_2}$$

The new variable CO* is defined by (6), and the simple regression equation for $Y$ on this predictor is given by

$$Y = 1076 + 857\, \mathrm{CO}^*$$

We see that $b_1 = a_1^*$ as well.

Note that the correlation coefficient of model (12) is $r = 0.74$, and that of model (14) is $r = 0.46$. The latter is less than the coefficient of correlation between incidence $Y$ and CO ($r = 0.58$).

5 Conclusion 

Let there be two regression equations for outcome $Y$

$$y = a_0 + a_1 x_1$$

and

$$y = b_0 + b_1 x_1 + b_2 x_2$$

If predictors $X_1, X_2$ are uncorrelated, then $a_1 = b_1$. Hence, inequality $a_1 \neq b_1$ is caused by the presence of correlation between the predictors. What is the epidemiological meaning of the change of coefficient $a_1$ to $b_1$ after adding predictor $X_2$ to the simple regression model? Is the influence of the predictors on the outcome redistributed between them? The answer is definitely 'no'. Usually, the addition of a second covariate is aimed at taking into account the combined effect of the predictors on the outcome. But what does 'take into account' mean? There are no reasonable explanations of this term.

In view of Theorem 1, we can state that the addition of $X_2$ to regression equation $y = a_0 + a_1 x_1$ brings us to regression equation $y = a_{10}^* + a_1^* x_1^*$. The new variable $X_1^*$ contains no linear statistical dependence on $X_2$. A similar interpretation holds for the case of three variables as well as for the general one.

Appendix 1

Proof of Theorem 1

Let us first prove a technical statement, which is of significance in its own right. Let there be a set of predictors $X_1, X_2, \ldots, X_k$ and let $Y$ be an outcome. The values of $p$ observations of the predictors and the outcome combine into matrices $X$ and $Y$

$$X = \begin{pmatrix} 1 & x_{11} & \cdots & x_{k1} \\ 1 & x_{12} & \cdots & x_{k2} \\ \vdots & \vdots & & \vdots \\ 1 & x_{1p} & \cdots & x_{kp} \end{pmatrix}, \qquad Y = \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_p \end{pmatrix}$$

The first column of $X$ consists of ones, so that $b_0$ is calculated by the same formulae as the other $b_i$. Let $B$ denote the column of coefficients $b_0, b_1, b_2, \ldots, b_k$. To find a linear regression equation for response $Y$ on predictors $X_1, X_2, \ldots, X_k$, we have to minimize the mean square residual of $Y$ and $XB$, i.e.

$$\min_{B}\, (Y - XB)^T (Y - XB) \tag{15}$$

where $T$ denotes matrix transposition. Problem (15) has a unique solution under the usual least squares method assumptions, e.g. if the matrix $X^T X$ is invertible (see e.g. Draper and Smith (1998, Chapter 5)). This assumption will be needed throughout Appendix 1.

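Let $\Gamma$ denote a nonsingular square matrix of order $k$

$$\Gamma = \begin{pmatrix} \gamma_{11} & \gamma_{12} & \cdots & \gamma_{1k} \\ \vdots & \vdots & & \vdots \\ \gamma_{k1} & \gamma_{k2} & \cdots & \gamma_{kk} \end{pmatrix}$$

and let $C$ be a matrix of order $(k + 1) \times (k + 1)$

$$C = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & \gamma_{11} & \cdots & \gamma_{1k} \\ \vdots & \vdots & & \vdots \\ 0 & \gamma_{k1} & \cdots & \gamma_{kk} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & \Gamma \end{pmatrix}$$

Let us introduce a vector, $X_0 = (1, X_1, X_2, \ldots, X_k)$, and consider the linear transformation of variables $X_1, X_2, \ldots, X_k$ by the matrix $C$

$$X_0^* = X_0 C \tag{16}$$

Thus, the new variables $X_0^* = (1, X_1^*, X_2^*, \ldots, X_k^*)$ obtained from variables $X_0 = (1, X_1, X_2, \ldots, X_k)$ by means of the linear transformation are given by

$$X_j^* = \sum_{i=1}^{k} \gamma_{ij} X_i, \qquad j = 1, \ldots, k$$

Finally, we denote by $X^*$ the matrix constructed from $X_0^*$ in the same way as $X$ from $X_0$, and $B^*$ stands for the column of coefficients $b_0^*, b_1^*, \ldots, b_k^*$.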

Proposition 1. Let the multiple regression equation of outcome $Y$ on predictors $X_1, X_2, \ldots, X_k$ be

$$y = \sum_{i=0}^{k} b_i x_i$$

(with $x_0 = 1$). Then coefficients $b_0^*, b_1^*, \ldots, b_k^*$ of the multiple regression equation for $Y$ on predictors $X_1^*, X_2^*, \ldots, X_k^*$

$$y = \sum_{i=0}^{k} b_i^* x_i^*$$

can be found from the matrix equality

$$B^* = C^{-1} B$$

It is easy to check that under condition (16) we have

$$X^* = XC \tag{17}$$

To find the regression equation relative to the new variables $X^*$ we need to solve the minimization problem

$$\min_{B^*}\, (Y - X^* B^*)^T (Y - X^* B^*)$$

Given equality (17), we have

$$(Y - X^* B^*)^T (Y - X^* B^*) = (Y - XCB^*)^T (Y - XCB^*)$$

Hence we obtain $CB^* = B$, since minimization problem (15) has a unique solution. It is obvious from its definition that matrix $C$ is invertible and

$$C^{-1} = \begin{pmatrix} 1 & 0 \\ 0 & \Gamma^{-1} \end{pmatrix}$$

This brings us to the end of the proof of the Proposition.
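
The Proposition can also be verified numerically. In the sketch below the data, the seed and the particular nonsingular matrix `Gamma` are assumptions chosen only for illustration; the check itself is the matrix equality $B^* = C^{-1} B$.

```python
import numpy as np

# Synthetic data with k = 3 predictors and p = 200 observations (assumed).
rng = np.random.default_rng(3)
p, k = 200, 3
predictors = rng.normal(size=(p, k))
y = 3.0 + predictors @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=p)

# Design matrix X with a leading column of ones.
X = np.column_stack([np.ones(p), predictors])

# An arbitrary nonsingular Gamma (lower triangular, determinant 2).
Gamma = np.array([[1.0, 0.0, 0.0],
                  [-0.5, 1.0, 0.0],
                  [0.3, 0.0, 2.0]])
C = np.eye(k + 1)
C[1:, 1:] = Gamma

X_star = X @ C                                        # equality (17)
B = np.linalg.lstsq(X, y, rcond=None)[0]              # coefficients on X
B_star = np.linalg.lstsq(X_star, y, rcond=None)[0]    # coefficients on X*
B_star_predicted = np.linalg.solve(C, B)              # C^{-1} B

print(B_star)
print(B_star_predicted)
```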
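Proof. To prove Theorem 1, consider the case of two predictors $X_1, X_2$, where matrix $C$ is equal to

$$C = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -c_{12} & 1 \end{pmatrix}$$

Then $X_1^* = X_1 - c_{12} X_2$ and $X_2^* = X_2$. Applying the Proposition to matrix $C$, we obtain

$$B^* = C^{-1} B = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & c_{12} & 1 \end{pmatrix} \begin{pmatrix} b_0 \\ b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} b_0 \\ b_1 \\ c_{12} b_1 + b_2 \end{pmatrix} \tag{18}$$

It can be easily seen that $X_1^*$ and $X_2$ are uncorrelated (the correlation coefficient is equal to zero). Therefore, coefficients $b_1^*, b_2^*$ of the multiple regression equation for outcome $Y$ on predictors $X_1^*, X_2$

$$y = b_0^* + b_1^* x_1^* + b_2^* x_2$$

are equal to the corresponding coefficients of the simple regression equations for $Y$ on predictors $X_1^*$ and $X_2$, respectively

$$y = a_{01}^* + a_1^* x_1^*, \qquad y = a_{02} + a_2 x_2$$

$$b_1^* = a_1^*, \qquad b_2^* = a_2 \tag{19}$$

Using (18), we obtain $b_1^* = b_1$, and combining this with (19) we obtain $b_1 = a_1^*$, which finishes the proof.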

Appendix 2 



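Proof of Theorem 2

Let us consider a linear transformation of variable $X_1$

$$X_1^* = X_1 - \gamma X_2 \tag{20}$$

Then in the regression equation

$$y = a_{10}^* + a_1^* x_1^* \tag{21}$$

coefficient $a_1^*$ becomes a function of parameter $\gamma$. Its explicit expression is given by

$$a_1^*(\gamma) = \frac{\overline{X_1 Y} - \bar{X}_1 \bar{Y} - \gamma \left( \overline{X_2 Y} - \bar{X}_2 \bar{Y} \right)}{\mathrm{var}(X_1) - 2\gamma\, \mathrm{cov}(X_1, X_2) + \gamma^2\, \mathrm{var}(X_2)} \tag{22}$$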

Proof (of Theorem 2). Recall the following regression equations for outcome $Y$ on predictors $X_1, X_2$ (jointly and separately)

$$y = b_0 + b_1 x_1 + b_2 x_2$$
$$y = a_{01} + a_1 x_1$$
$$y = a_{02} + a_2 x_2$$

and introduce the matrices

$$A = (a_1, a_2), \qquad B = (b_1, b_2), \qquad C = \begin{pmatrix} 1 & c_{12} \\ c_{21} & 1 \end{pmatrix},$$

where $c_{12}, c_{21}$ are the regression coefficients from the equations

$$x_1 = c_{120} + c_{12} x_2$$
$$x_2 = c_{210} + c_{21} x_1$$

According to the theorem of Panov and Varaksin (2010), we have the equality $A = B \cdot C$. Now, suppose that $C$ is an invertible matrix (the opposite case is discussed in the Remark below). Then

$$B = A \cdot C^{-1}$$

Thus we obtain the following representation of regression coefficients $b_1, b_2$ ($r$ denotes the correlation coefficient between $X_1, X_2$)

$$b_1 = \frac{a_1 - a_2 c_{21}}{1 - c_{12} c_{21}} = \frac{a_1 - a_2 c_{21}}{1 - r^2} \tag{23}$$

$$b_2 = \frac{a_2 - a_1 c_{12}}{1 - c_{12} c_{21}} = \frac{a_2 - a_1 c_{12}}{1 - r^2} \tag{24}$$

From (23)-(24), it follows that

$$a_1 - a_2 c_{21} = b_1 (1 - r^2), \qquad a_2 - a_1 c_{12} = b_2 (1 - r^2)$$

$$b_1 c_{12} = a_2 - b_2, \qquad b_2 c_{21} = a_1 - b_1$$

$$c_{12} = \frac{a_2}{b_1} - \frac{b_2}{b_1} = \frac{a_2 - b_2}{b_1}, \qquad c_{21} = \frac{a_1}{b_2} - \frac{b_1}{b_2} = \frac{a_1 - b_1}{b_2} \tag{25}$$

$$r^2 = c_{12} c_{21} = \frac{a_1 a_2}{b_1 b_2} - \frac{a_1}{b_1} - \frac{a_2}{b_2} + 1, \qquad 1 - r^2 = \frac{a_1}{b_1} + \frac{a_2}{b_2} - \frac{a_1 a_2}{b_1 b_2}$$

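If we equate the right hand side of (23) to the right hand side of (22), we obtain the roots of the equation $a_1^*(\gamma) = b_1$ (after some simplification)

$$\begin{aligned}
\gamma_{1,2} = \frac{1}{2 (a_1 - a_2 c_{21})\, \mathrm{var}(X_2)} \Big( & 2\, \mathrm{cov}(X_1, X_2) (a_1 - a_2 c_{21}) - a_2\, \mathrm{var}(X_2) (1 - c_{12} c_{21}) \\
& \pm \Big[ 4 (-a_2 + a_1 c_{12})\, c_{21}\, \mathrm{var}(X_1)\, \mathrm{var}(X_2)\, (-a_1 + a_2 c_{21}) \\
& \quad + \big( -2 a_1\, \mathrm{cov}(X_1, X_2) + a_2 ( 2 c_{21}\, \mathrm{cov}(X_1, X_2) + \mathrm{var}(X_2) (1 - c_{12} c_{21}) ) \big)^2 \Big]^{1/2} \Big)
\end{aligned}$$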

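Applying (25), we get

$$\begin{aligned}
\gamma_{1,2} = \frac{1}{2\, \mathrm{var}(X_2)\, b_1 (1 - r^2)} \Big( & 2 r \sigma(X_1) \sigma(X_2)\, b_1 (1 - r^2) - a_2\, \mathrm{var}(X_2) (1 - r^2) \\
& \pm \Big[ 4 b_1 b_2 c_{21} (1 - r^2)^2\, \mathrm{var}(X_1)\, \mathrm{var}(X_2) \\
& \quad + \big( -2 a_1 r \sigma(X_1) \sigma(X_2) + a_2 ( 2 c_{21} r \sigma(X_1) \sigma(X_2) + \mathrm{var}(X_2) (1 - r^2) ) \big)^2 \Big]^{1/2} \Big)
\end{aligned}$$

where $\sigma(X_i) = \sqrt{\mathrm{var}(X_i)}$.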

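Next, we expand the second summand in the radicand and factor out $-2 r \sigma(X_1) \sigma(X_2)$. After that, $a_1 - a_2 c_{21}$ is replaced by $b_1 (1 - r^2)$ (see (23)). We get

$$\begin{aligned}
\gamma_{1,2} = \frac{1}{2\, \mathrm{var}(X_2)\, b_1 (1 - r^2)} \Big( & (1 - r^2) \big( 2 r \sigma(X_1) \sigma(X_2)\, b_1 - a_2\, \mathrm{var}(X_2) \big) \\
& \pm \Big[ 4 b_1 b_2 c_{21} (1 - r^2)^2\, \mathrm{var}(X_1)\, \mathrm{var}(X_2) \\
& \quad + \big( -2 r \sigma(X_1) \sigma(X_2)\, b_1 (1 - r^2) + a_2\, \mathrm{var}(X_2) (1 - r^2) \big)^2 \Big]^{1/2} \Big)
\end{aligned}$$

or

$$\gamma_{1,2} = \frac{1}{2 b_1} \left( 2 r \frac{\sigma(X_1)}{\sigma(X_2)} b_1 - a_2 \pm \sqrt{ 4 b_1 b_2 c_{21} \frac{\mathrm{var}(X_1)}{\mathrm{var}(X_2)} + \left( 2 r \frac{\sigma(X_1)}{\sigma(X_2)} b_1 - a_2 \right)^2 } \right)$$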

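Applying (25) again, together with $c_{12} = r \sigma(X_1) / \sigma(X_2)$ and $c_{21}\, \mathrm{var}(X_1) / \mathrm{var}(X_2) = c_{12}$, we obtain the required equalities

$$\begin{aligned}
\gamma_{1,2} &= \frac{1}{2 b_1} \left( 2 b_1 c_{12} - a_2 \pm \sqrt{ 4 b_1 b_2 c_{12} + (2 b_1 c_{12} - a_2)^2 } \right) \\
&= c_{12} - \frac{a_2}{2 b_1} \pm \sqrt{ \frac{b_2}{b_1} c_{12} + \left( c_{12} - \frac{a_2}{2 b_1} \right)^2 } \\
&= c_{12} - \frac{a_2}{2 b_1} \pm \sqrt{ \frac{a_2}{b_1} c_{12} - c_{12}^2 + c_{12}^2 - \frac{a_2}{b_1} c_{12} + \left( \frac{a_2}{2 b_1} \right)^2 } \\
&= c_{12} - \frac{a_2}{2 b_1} \pm \frac{a_2}{2 b_1}
\end{aligned}$$

That is

$$\gamma_1 = c_{12}, \qquad \gamma_2 = c_{12} - \frac{a_2}{b_1}$$

or

$$\gamma_1 = c_{12}, \qquad \gamma_2 = -\frac{b_2}{b_1}$$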

Remark. If correlation matrix $C$ is singular, then $r^2 = 1$, i.e. predictors $X_1, X_2$ are proportional. In this case, the problem of finding a multiple regression equation on variables $X_1, X_2$ cannot be posed, since it leads to an inconsistent system of linear equations.



Appendix 3 

Proof of Theorem 3

The method of proving Theorem 3 considered in this Appendix contains the main ideas of the proof of the general statement.

Let the linear multiple regression equation for outcome $Y$ on predictors $X_1, X_2, X_3$ be

$$y = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3$$

We introduce the new variable $X_1^*$ by

$$X_1^* = X_1 - \gamma_2 X_2 - \gamma_3 X_3,$$

where $\gamma_2, \gamma_3$ are some constants. So we perform a linear transformation of the predictors by the matrix

$$C = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & -\gamma_2 & 1 & 0 \\ 0 & -\gamma_3 & 0 & 1 \end{pmatrix}, \qquad \det(C) = 1$$

The inverse matrix $C^{-1}$ is equal to

$$C^{-1} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & \gamma_2 & 1 & 0 \\ 0 & \gamma_3 & 0 & 1 \end{pmatrix}$$

Hence the coefficients of the regression equation for $Y$ on $X_1, X_2, X_3$ and that on $X_1^*, X_2^* = X_2, X_3^* = X_3$ are connected by (see Appendix 1)

$$\begin{pmatrix} b_0^* \\ b_1^* \\ b_2^* \\ b_3^* \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & \gamma_2 & 1 & 0 \\ 0 & \gamma_3 & 0 & 1 \end{pmatrix} \begin{pmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \end{pmatrix} = \begin{pmatrix} b_0 \\ b_1 \\ \gamma_2 b_1 + b_2 \\ \gamma_3 b_1 + b_3 \end{pmatrix} \tag{26}$$

In particular, for arbitrary $\gamma_2, \gamma_3$ coefficients $b_1$ and $b_1^*$ are equal. From now on we assume $\gamma_2 = c_{12}$, $\gamma_3 = c_{13}$.

Let us consider the simple regression equation for $Y$ on $X_1^*$

$$y = a_{01}^* + a_1^* x_1^*$$

We divide the proof of Theorem 3 into a sequence of lemmas.

Lemma 1. The multiple correlation coefficient of variable $X_1^*$ on predictors $X_2, X_3$ is equal to zero.

Proof. Let $\rho_{1 \cdot 23}$ be the multiple correlation coefficient of $X_1^*$ on variables $X_2, X_3$. By its definition

$$\rho_{1 \cdot 23}^2 = 1 - \frac{|Corr|}{C_{11}},$$

where $|Corr|$ is the determinant of the correlation matrix of variables $X_1^*, X_2, X_3$, and $C_{11}$ is the cofactor of the (1,1) entry of the matrix $Corr$, that is

$$|Corr| = \begin{vmatrix} 1 & r_{12} & r_{13} \\ r_{21} & 1 & r_{23} \\ r_{31} & r_{32} & 1 \end{vmatrix}, \qquad C_{11} = \begin{vmatrix} 1 & r_{23} \\ r_{32} & 1 \end{vmatrix}$$

Similar to the case of two predictors, one can see that $r_{12} = r_{21} = r_{13} = r_{31} = 0$. Hence

$$|Corr| = 1 \cdot \begin{vmatrix} 1 & r_{23} \\ r_{32} & 1 \end{vmatrix} = C_{11},$$

i.e. $\rho_{1 \cdot 23} = 0$.

Lemma 2. Consider the multiple regression equation for outcome $Y$ on variables $X_1^*, X_2$

$$y = b_0' + b_1' x_1^* + b_2' x_2 \tag{27}$$

and that on variables $X_1^*, X_3$

$$y = b_0'' + b_1'' x_1^* + b_3'' x_3$$

Also, consider the simple regression equations for outcome $Y$ on predictors $X_2$ and $X_3$, respectively

$$y = a_{02} + a_2 x_2$$
$$y = a_{03} + a_3 x_3$$

Then

$$b_2' = a_2, \qquad b_3'' = a_3$$

Besides,

$$b_1' = b_1'' = a_1^*, \tag{28}$$

where $a_1^*$ is the coefficient of the simple regression equation for $Y$ on $X_1^*$ given above.

Proof. As mentioned above, covariates $X_1^*, X_2$ are uncorrelated, as are $X_1^*, X_3$. Hence $b_2' = a_2$, $b_3'' = a_3$. The last equality (28) is implied by Theorem 1.

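Lemma 3. Let there be a multiple regression equation for variable $X_3$ on $X_1^*, X_2$

$$x_3 = a_{0312} + a_{31}^* x_1^* + a_{32} x_2 \tag{29}$$

Then

$$a_{31}^* = 0$$

For the multiple regression equation of $X_2$ on predictors $X_1^*, X_3$

$$x_2 = a_{0213} + a_{21}^* x_1^* + a_{23} x_3$$

we have

$$a_{21}^* = 0$$

Proof. We obtain it by a tedious calculation. By the least squares method, coefficient $a_{31}^*$ can be obtained from a system of linear equations. The numerator of the expression for $a_{31}^*$ is the determinant

$$\begin{vmatrix} 1 & \bar{X}_3 & \bar{X}_2 \\ \bar{X}_1 - c_{12} \bar{X}_2 - c_{13} \bar{X}_3 & \overline{X_1 X_3} - c_{12} \overline{X_2 X_3} - c_{13} \overline{X_3^2} & \overline{X_1 X_2} - c_{12} \overline{X_2^2} - c_{13} \overline{X_2 X_3} \\ \bar{X}_2 & \overline{X_2 X_3} & \overline{X_2^2} \end{vmatrix} \tag{30}$$

From the corresponding systems of linear equations we obtain

$$c_{12} = \frac{\begin{vmatrix} 1 & \bar{X}_1 & \bar{X}_3 \\ \bar{X}_2 & \overline{X_1 X_2} & \overline{X_2 X_3} \\ \bar{X}_3 & \overline{X_1 X_3} & \overline{X_3^2} \end{vmatrix}}{\begin{vmatrix} 1 & \bar{X}_2 & \bar{X}_3 \\ \bar{X}_2 & \overline{X_2^2} & \overline{X_2 X_3} \\ \bar{X}_3 & \overline{X_2 X_3} & \overline{X_3^2} \end{vmatrix}}, \qquad c_{13} = \frac{\begin{vmatrix} 1 & \bar{X}_2 & \bar{X}_1 \\ \bar{X}_2 & \overline{X_2^2} & \overline{X_1 X_2} \\ \bar{X}_3 & \overline{X_2 X_3} & \overline{X_1 X_3} \end{vmatrix}}{\begin{vmatrix} 1 & \bar{X}_2 & \bar{X}_3 \\ \bar{X}_2 & \overline{X_2^2} & \overline{X_2 X_3} \\ \bar{X}_3 & \overline{X_2 X_3} & \overline{X_3^2} \end{vmatrix}}$$

After substituting these into (30) and making the necessary simplifications we obtain $a_{31}^* = 0$.

The second equality is proved in just the same way.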

Proof of Theorem 3. By (26) we have $b_1 = b_1^*$. Lemma 2 shows that $b_1' = b_1'' = a_1^*(c_{12}, c_{13})$. In what follows we need an appropriate generalization of the theorem of Panov and Varaksin (2010). It is provided below in Appendix 4. Applying it, we get

$$b_1' = b_1^* + b_3^* a_{31}^* \tag{31}$$

This proves Theorem 3, since $a_{31}^* = 0$ by Lemma 3.

Appendix 4

A theorem on relationship among regression coefficients

What follows is a statement of the theorem used in Appendix 3.

Theorem (Panov). Let there be an outcome $Y$ and a set of predictors $X_1, X_2, \ldots, X_k$. Consider a multiple regression equation for the outcome on the set of predictors

$$y = b_0 + \sum_{i=1}^{k} b_i x_i \tag{32}$$

From the set of predictors $X_1, X_2, \ldots, X_k$, we extract a subset $\{X_{i_1}, X_{i_2}, \ldots, X_{i_m}\}$ and introduce regression equations for each predictor on the subset of predictors extracted

$$x_i = c_i + \sum_{j=1}^{m} c_{i i_j} x_{i_j} \tag{33}$$

We suppose that $c_i = 0$, $c_{i i_j} = \delta_{i i_j}$ for $i \in \{i_1, i_2, \ldots, i_m\}$. Finally, let there be a multiple regression equation for outcome $Y$ on the set of predictors $\{X_{i_1}, X_{i_2}, \ldots, X_{i_m}\}$

$$y = a_0 + \sum_{j=1}^{m} a_{i_j} x_{i_j} \tag{34}$$

Then

$$a_{i_j} = \sum_{i=1}^{k} b_i c_{i i_j} \tag{35}$$


This theorem has been used in Appendix 3 as follows. The set of all predictors is $\{X_1^*, X_2, X_3\}$, the extracted set of predictors is $\{X_1^*, X_2\}$, $i_1 = 1$, $i_2 = 2$. Then (32) becomes the equation (see (26))

$$y = b_0^* + b_1^* x_1^* + b_2^* x_2 + b_3^* x_3$$

Equation (33) transforms into (29), and

$$c_{11} = 1, \quad c_{12} = 0, \quad c_{21} = 0, \quad c_{22} = 1, \quad c_{31} = a_{31}^*, \quad c_{32} = a_{32} \tag{36}$$

Equation (34) is equation (27), and

$$a_1 = b_1', \qquad a_2 = b_2'$$

Thus (35) becomes (for $a_1 = b_1'$)

$$b_1' = b_1^* c_{11} + b_2^* c_{21} + b_3^* c_{31}$$

Applying (36), we obtain (31)

$$b_1' = b_1^* + b_3^* a_{31}^*$$
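
Relation (35) can be illustrated numerically for three predictors and the extracted subset $\{X_1, X_2\}$, in which case it reads $a_1 = b_1 + b_3 c_{31}$ and $a_2 = b_2 + b_3 c_{32}$. The sketch below is an illustration under assumptions: the synthetic data, the seed and the helper `ols` are not taken from the paper.

```python
import numpy as np

# Synthetic correlated predictors (an assumption for illustration).
rng = np.random.default_rng(11)
n = 400
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
x3 = 0.7 * x1 - 0.4 * x2 + rng.normal(size=n)
y = 1.0 + 0.8 * x1 - 1.1 * x2 + 0.6 * x3 + rng.normal(size=n)


def ols(target, *cols):
    """Hypothetical helper: least-squares coefficients, intercept first."""
    X = np.column_stack([np.ones(len(target)), *cols])
    return np.linalg.lstsq(X, target, rcond=None)[0]


b = ols(y, x1, x2, x3)    # equation (32) on all predictors
a = ols(y, x1, x2)        # equation (34) on the extracted subset
c3 = ols(x3, x1, x2)      # equation (33) for X3: c3[1] = c31, c3[2] = c32

# Relation (35), using c11 = c22 = 1 and c12 = c21 = 0 for subset members.
a1_predicted = b[1] + b[3] * c3[1]
a2_predicted = b[2] + b[3] * c3[2]

print(a[1], a1_predicted)
print(a[2], a2_predicted)
```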